Yelp Business Info Scraper
Pricing
$19.99/month + usage
🏪 Yelp Business Info Scraper pulls structured data from Yelp—names, ratings⭐, reviews, phone☎️, address📍, hours⏰, categories, price, photos & links. 📊 Perfect for local SEO, lead gen, market research & competitor analysis. 🔎 Fast, accurate, export-ready CSV/JSON. 🚀
Developer
ScrapeFlow
Maintained by Community
Scrape detailed business information from Yelp business pages at scale. This Apify Actor extracts title, rating, reviews, address, phone, hours, images, categories, services, business owner, "about", review highlights, and more from one or many Yelp URLs.
How it works
The actor performs a Chrome-impersonated HTTP request (via curl_cffi) to each Yelp business page, then parses the embedded Apollo/GraphQL cache (<script data-apollo-state>) to produce a structured record. It does not use a headless browser.
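The parsing step described above can be sketched as follows. This is a minimal illustration using only the standard library, not the actor's actual code. It assumes the script body may be wrapped in an HTML comment and may be entity-encoded; both are assumptions about the page, and the function name `extract_apollo_state` is hypothetical.

```python
import html
import json
import re
from typing import Optional

# Matches <script ... data-apollo-state ...>BODY</script>; attribute order is not assumed.
APOLLO_SCRIPT_RE = re.compile(
    r"<script\b[^>]*\bdata-apollo-state\b[^>]*>(.*?)</script>",
    re.DOTALL | re.IGNORECASE,
)


def extract_apollo_state(page_html: str) -> Optional[dict]:
    """Return the parsed Apollo cache, or None if the tag is absent or unparseable."""
    match = APOLLO_SCRIPT_RE.search(page_html)
    if match is None:
        return None
    body = match.group(1).strip()
    # Assumption: the payload may be wrapped in an HTML comment.
    if body.startswith("<!--") and body.endswith("-->"):
        body = body[4:-3].strip()
    # Try the raw text first, then an entity-decoded copy (assumed fallback).
    for candidate in (body, html.unescape(body)):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None
```

The returned dictionary is the raw cache. Mapping its entries to the output fields listed later depends on Yelp's current schema, which this sketch does not attempt to model.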
Anti-blocking strategy
- Apify residential proxy for every request.
- curl_cffi Chrome 131 impersonation so the TLS/JA3 fingerprint matches a real browser.
- Retries with exponential backoff: up to 3 attempts per URL.
- DataDome challenge solving: when a var dd = {...} challenge with rt='i' is detected, the actor calls CapSolver AntiDatadomeTask with the challenge parameters, then refetches the URL with the returned datadome cookie.
- Hard-reject skip: rt='c' challenges have no puzzle, so the actor stops retrying that IP pool instead of burning attempts.
- Translate-proxy fallback: as a last resort the URL is fetched through translate.google.com, which Yelp's CDN treats as benign and which preserves the Apollo JSON intact.
- Live saving: every record is pushed to the dataset as soon as it's parsed.
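The retry and challenge-classification logic from the list above could look roughly like this. It is a sketch under stated assumptions, not the actor's implementation: the `fetch` callable is injected (so no network or proxy library is required), the exact layout of the `var dd = {...}` object is assumed, and the names `classify_challenge` and `fetch_with_retries` are hypothetical. The CapSolver call and the translate fallback are left out.

```python
import random
import re
import time
from typing import Callable, Optional, Tuple

DD_OBJECT_RE = re.compile(r"var\s+dd\s*=\s*\{(.*?)\}", re.DOTALL)
RT_FIELD_RE = re.compile(r"""\brt\b['"]?\s*:\s*['"](\w+)['"]""")


def classify_challenge(body: str) -> Optional[str]:
    """Return the 'rt' value of a challenge object, or None if none is found."""
    dd = DD_OBJECT_RE.search(body)
    if dd is None:
        return None
    rt = RT_FIELD_RE.search(dd.group(1))
    return rt.group(1) if rt else None


def fetch_with_retries(
    url: str,
    fetch: Callable[[str], str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[str], str]:
    """Call fetch(url) up to max_attempts times with exponential backoff.

    Returns (body, outcome); outcome is 'ok', 'solvable', 'hard_reject' or 'failed'.
    """
    for attempt in range(max_attempts):
        try:
            body = fetch(url)
        except Exception:  # broad on purpose: any transport error triggers a retry
            body = None
        if body is not None:
            rt = classify_challenge(body)
            if rt is None:
                return body, "ok"
            if rt == "i":
                return body, "solvable"  # hand off to a challenge solver
            if rt == "c":
                return None, "hard_reject"  # nothing to solve; stop retrying
        if attempt < max_attempts - 1:
            # Delay doubles each attempt (1s, 2s, ...) plus a little jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
    return None, "failed"
```

Injecting `fetch` and `sleep` keeps the control flow testable without real requests; in practice `fetch` would wrap the proxied, browser-impersonating HTTP call.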
Input
| Field | Type | Required | Description |
|---|---|---|---|
| startUrls | array | Yes | List of Yelp business page URLs (e.g. https://www.yelp.com/biz/east-village-pizza-new-york). |
| proxyConfiguration | object | No | Apify proxy settings (collapsed by default). Defaults to residential proxy. |
Example input:
{
  "startUrls": [
    { "url": "https://www.yelp.com/biz/east-village-pizza-new-york" }
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
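When the URL list comes from another system, the input object can be assembled programmatically. The helper below is a hypothetical sketch: `build_actor_input` is not part of the actor, and the check that business pages live under `/biz/` on a `yelp.` host is an assumption drawn from the example URL above.

```python
from urllib.parse import urlparse


def build_actor_input(urls, use_residential=True):
    """Build the actor input dict from a list of Yelp business page URLs."""
    start_urls = []
    for url in urls:
        parsed = urlparse(url)
        # Assumption: business pages look like https://www.yelp.<tld>/biz/<slug>
        if "yelp." not in parsed.netloc or not parsed.path.startswith("/biz/"):
            raise ValueError(f"Not a Yelp business page URL: {url}")
        start_urls.append({"url": url})
    actor_input = {"startUrls": start_urls}
    if use_residential:
        # Mirrors the proxyConfiguration shown in the example input.
        actor_input["proxyConfiguration"] = {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        }
    return actor_input
```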
Output
Each business is pushed to the default dataset with this shape:
| Field | Description |
|---|---|
| title | Business name |
| rating | Numeric rating (e.g. "4.1") |
| reviewCount | e.g. "651 reviews" |
| isClaimed | "Claimed" or "Unclaimed" |
| priceLevel | e.g. "$", "$$" |
| categories | Comma-separated categories |
| fullAddress, city, state, zipcode | Address fields |
| phoneNumber | Formatted phone |
| images | Array of large image URLs |
| website | Business website URL |
| hours | Map of Mon/Tue/…/Sun (plus upcoming special dates) to hour strings, with "Open now" or "Closed now" appended to today's entry |
| businessOwnerName, about | Owner display name and combined specialties/history |
| reviewhighlights | Array of review-highlight snippets |
| businessServices | Object of service name → boolean (delivery, take-out, accessibility, payment, etc.) |
| yelp_biz_id | Yelp internal business ID |
| timestamp | UTC scrape time |
| url, is_page_not_found, status | URL, 404 flag, "SUCCEEDED" or "FAILED" |
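Several fields arrive as display strings (for example "651 reviews" or "Claimed"), so downstream analysis may want typed values. The following is one possible post-processing sketch based only on the formats in the table above; `normalize_record` is a hypothetical helper, and any value that does not match the documented format is mapped to None.

```python
import re


def normalize_record(record: dict) -> dict:
    """Return a copy of a dataset record with display strings converted to typed values."""
    out = dict(record)

    # "4.1" -> 4.1
    try:
        out["rating"] = float(record.get("rating"))
    except (TypeError, ValueError):
        out["rating"] = None

    # "651 reviews" -> 651 (thousands separators tolerated)
    match = re.search(r"\d[\d,]*", str(record.get("reviewCount") or ""))
    out["reviewCount"] = int(match.group().replace(",", "")) if match else None

    # "Claimed" / "Unclaimed" -> True / False
    out["isClaimed"] = {"Claimed": True, "Unclaimed": False}.get(record.get("isClaimed"))

    # "$$" -> 2
    price = record.get("priceLevel")
    out["priceLevel"] = len(price) if price and set(price) == {"$"} else None

    # "Pizza, Italian" -> ["Pizza", "Italian"]
    raw_categories = record.get("categories") or ""
    out["categories"] = [c.strip() for c in raw_categories.split(",") if c.strip()]
    return out
```

Records with status "FAILED" or is_page_not_found set to true will mostly normalize to None values, so filtering on those two fields first is usually simpler.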
Run via API (cURL)
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -d '{"startUrls":[{"url":"https://www.yelp.com/biz/east-village-pizza-new-york"}]}' \
  "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs?token=YOUR_APIFY_TOKEN"
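The same request can be prepared from Python with the standard library. This sketch only builds the request object, using the endpoint and headers from the cURL call above; `build_run_request` is a hypothetical helper, and the token is sent in the Authorization header only rather than also in the query string.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, start_urls: list) -> urllib.request.Request:
    """Prepare a POST request that starts an actor run for the given URLs."""
    payload = {"startUrls": [{"url": u} for u in start_urls]}
    return urllib.request.Request(
        url=f"https://api.apify.com/v2/acts/{actor_id}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; replace the placeholders YOUR_ACTOR_ID and YOUR_APIFY_TOKEN with real values before doing so.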
FAQ
Why did some URLs fail?
A page may have been removed (is_page_not_found: true), the IP pool may have been hard-rejected (rt='c'), or the translate fallback may have returned a short body. Check the log for [fetch] messages.
Cautions
- Only public Yelp pages are scraped.
- You are responsible for complying with applicable laws (privacy, data protection, terms of use).